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Abstract 

The automation of rotorcraft low-altitude flight presents challenging problems in control, 
computer vision and image understanding. A critical element in this problem is the ability 
to detect and locate obstacles, using on-board sensors, and modify the nominal trajectory. 
This requirement is also necessary for the safe landing of an autonomous lander on Mars. 
This paper examines some of the issues in the location of objects using a sequence of images 
from a passive sensor, and describes a Kalman filter approach to estimate range to obstacles. 
The Kalman filter is also used to track features in the images leading to a significant reduction 
of search effort in the feature extraction step of the algorithm. The method can compute 
range for both straight line and curvilinear motion of the sensor. An experiment is designed 
in the laboratory to acquire a sequence of images along with the sensor motion parameters 
under conditions similar to helicopter flight. The paper presents range estimation results 
using this imagery. 

1 Introduction 

Rotorcraft operating in high-threat environment fly close to the earth’s surface to utilize 
surrounding terrain, vegetation, or man-made objects to minimize the risk of being detected 
by the enemy. Increasing levels of concealment are achieved by adopting different tactics 
during low-altitude flight. The piloting of the rotorcraft is, at best, a very demanding task 
and the pilot will need help from on-board automation tools in order to devote more time to 
mission-related activities. The development of an automation tool, which has the potential 
to detect obstacles in the rotorcraft flight path, warn the crew, and interact with the guid- 
ance system to avoid detected obstacles, presents challenging problems. 

The planning of rotorcraft low-altitude missions cam be divided into far-field planning 
and near- field planning [1]. Far-field planning involves the selection of goals and a nominal 
trajectory between the goads. Far-field planning is based on a priori information and requires 
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a detailed map of the local terrain. However, the database for even the best surveyed land- 
scape will not have adequate resolution to indicate objects such as trees, buildings, wires 
and transmission towers. This information has to be acquired using an on-board sensor 
and integrated into the navigation/guidance system to modify the nominal trajectory of the 
rotorcraft. Because vision alone will not be adequate for detecting small obstacles such as 
wires, it is expected that the system will include an active sensor whose search can be di- 
rected to complement the vision system while minimizing the risk of detection [2]. Initially, 
passive imaging sensors such as forward looking infrared (FLIR) and low-light-level-television 
(LLLTV) will be considered in order to assess the operational potential of passive methods. 

The recovery of depth using electro-optical sensors, referred to as passive ranging, is 
based on triangulation and requires two images of the outside world from two different imag- 
ing conditions. In stereo methods, two or more cameras located at different positions are 
used to obtain images of the outside world. In motion methods, the same camera is moved 
from one position to another to capture two or more images of the outside world. Passive 
ranging has been the subject of considerable study in computer vision [3, 4]. Several papers 
have described the use of a sequence of images to determine the orientation and position of a 
rigid body [5], recover both vehicle motion and location of surrounding objects [6], compute 
range by matching vertices of man-made objects [7], and estimate range for linear motions 
of the vehicle [8, 9]. Krotkov [10] has used focusing to compute the distance to objects lying 
in the range 1 to 3 meters. Recently, Ma and Olsen [11] have proposed the acquisition of a 
monocular image sequence by changing the focal length of the camera and using the resulting 
images for range computation. 

The passive ranging methods described earlier have evolved from the needs of the Au- 
tonomous Land Vehicle (ALV) and other ground-based robotic applications. Road following 
is the key guidance function in the ALV whereas the ability to maneuver around obstacles 
is the challenge for guidance in low altitude flight. The rotorcraft flight at low-altitude has 
several distinct characteristics: (i) the nature of the scenarios encountered outdoors makes 
methods dependent on model based vision such as matching verticies of man-made objects 
not applicable, (ii) due to the curvilinear motion of the rotorcraft, a large class of passive 
ranging algorithms designed for linear flight are not directly applicable, (iii) the objects of 
interest during a rotorcraft flight may vary from 50 to 1000 feet resulting in a large variation 
in the optical flow in the image, and (iv) the availability of the sensor motion parameters 
from an inertial navigation system. The approach used in this analysis is geared to the re- 
quirements of low altitude rotorcraft flight and differs from general motion analysis methods 
in significant ways. The distinguishing features of our approach are: (i) we do not attempt 
to estimate the rotorcraft’s motion from the images, (ii)the algorithms are designed to be 
recursive, (iii) the algorithms can handle curvilinear rotorcraft motion, (iv) feature tracking 
and range estimation are done incrementally to reduce the search space as well as to discard 
false matches, and (v) a Kalman filter formulation allows the use of several optical sensors 
and it provides for a natural way of integrating stereo and motion methods. The rotorcraft 
parameters (angular rate, translational velocity, position and attitude) are assumed to be 
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computed using an on-board inertial navigation system. Given a sequence of images, using 
image-object differential equations, a Kalman filter [12] can be used to estimate both the 
relative coordinates and the earth coordinates of objects on the ground. The Kalman filter 
can also be used in a predictive mode to track features in the images, leading to a significant 
reduction of search effort in the feature extraction step of the algorithm. The performance 
of three different Kalman filters for different rotorcraft maneuvers were examined in [12]. 
This previous study did not, however, include the processing of real images. This paper de- 
scribes the computation of the optical flow and uses the resulting optical flow in an Extended 
Kalman Filter (EKF) to estimate range. We present range estimation results for both linear 
and curvilinear motions of the camera. The experience gained from the application of this 
algorithm to real images is very valuable and it is a necessary step before proceeding to the 
estimation of range during low-altitude curvilinear flight using an Extended Kalman Filter 
(EKF). 

The paper is organized as follows: Section 2 describes the relation between the image, 
the rotorcraft and objects of interest under full curvilinear motion. Section 3 describes our 
approach to the estimation of optical flow and the resulting image track. Section 4 describes 
the recursive range estimation algorithm and Section 5 considers the performance of this 
algorithm using image data acquired in a laboratory setting. Finally, Section 6 provides a 
summary, conclusions and ideas for future work. 


2 Passive Ranging 

Passive ranging is the ability to estimate distances to various objects close to the flight path 
of the helicopter using passive sensing by electro-optical cameras. This section will describe 
the basic relations between the two-dimensional (2D) sensor image variables (displacement 
and velocity) to the three-dimensional (3D) terrain geometry (i.e. points, lines and other 
features) and sensor motion parameters (i.e. position, attitude, translational and rotational 
velocity). These relations provide the dynamic models for the estimation of range. For 
simplicity, the camera is assumed to be fixed at the center of gravity of the rotorcraft with 
its optical axis oriented along the rotorcraft’s longitudinal body axis. Figure 1 shows the 
viewing geometry of the camera. In actual practice, the camera is mounted at a convenient 
location away from the center of gravity of the rotorcraft. If necessary, the camera is allowed 
variable orientation with respect to the body of the rotorcraft. This flexibility is provided 
in our implementation of the passive ranging algorithm. 

Consider an earth-fixed, north-east-down, coordinate system. Let r h = (x/,, yh, z^) 7 
and r = (x, y, z) T be the earth coordinates of the rotorcraft and a point 0 on the ground 
respectively. The rotorcraft moves with respect to the earth at a translational velocity 

v = (v„v„v,y. 

The orthonormal coordinate transformation from the earth axes to body axes is denoted 
by the 3x3 matrix T which depends on the rotorcraft attitude . Let T a be the corresponding 
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Figure 1: Viewing geometry of the imaging sensor 

transformation matrix from body to sensor-axis system. 

The relative position, p , of any point 0 with respect to the rotorcraft can be written as 

p = r-r h (1) 

The rate of change of this vector as viewed from the moving rotorcraft can be determined 
by the Coriolis equation relating rate of change of a vector as viewed from a earth-fixed 
coordinate frame to that viewed from a moving frame. The standard vector form of the 
Coriolis equation [13] is 

P fixed ~ Pmoving "f ^ ^ 

Let p b = [z;,, j/t, Zfc] T be the vector between the rotorcraft and point 0 expressed in body- 
coordinates. Similarly, H = [14* > Vt y , Vb z ] T and u >b = [i^bx,^by,^>bz] T represent the trans- 
lational and rotational velocity of the rotorcraft in body axes. Let p, = z a ] , 

V, = [V.x, V. y , V„] T and u, = [w 4I ,u> iV ,u;„] T be the corresponding quantities in the sensor- 

axes system. 

Using equations (1) and (2) and matrices T and we can write the relation 

p t — — w, x pt — V t (3) 
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Let the image plane be perpendicular to the optical axis. Then, using similar triangles 


u = fxjz a 

v = fy./z , (4) 

where / is the focal length of the sensor. As the rotorcraft moves, the image of the object 
0 moves in the image plane. Differentiating u and v with respect to time, we have 

u = f(-x a z a + z a x a )/z] 

V = }{-y,z a + z a y a )/z 2 (5) 

Substituting equation (3) into (5) and (4) leads to the result 

ti = u t + ii r 

v = v t + v r (6) 

where 

«, = (~V ax + V az u/f)f/z a 

v t = (-V ay + V„v/f)f/z a (7) 

u r = u> tx uv/f - u >, v (u 2 // 2 + 1)/ + u az v 
Vr = W«(V 2 // 2 + 1)/ - U! lv Uv/f - 

The velocity (ti, t>) associated with each point in an image is referred to as optical flow [14] 
where (it|,t>t) denotes the translational component of the optical flow, and (u r ,v r ) the ro- 
tational component of the optical flow. The rotational component of the optical flow is a 
function of the image position only and provides no information about the location of object 
0. Thus, given the six motion parameters of the sensor (K,u>,), by computing the optical 
flow (ti,v), the translational component of the flow at a point can be used to estimate the 
distance of the point 0 along the optical axis (0,0, z a ) . For a rotorcraft flying in a straight 
line, the optical flow will be zero at a point (V ax /V az , V ay /V az ) in the image plane. This point 
(/*,/»), referred to as the focus-of-expansion (FOE), corresponds to the intersection of V az 
and the image plane and it plays an important role in optical flow computations. We do 
not use the direct approach suggested by equation (6) in our computation of z a ; instead, the 
range is computed using the more robust recursive estimation procedure outlined in Section 
4. 


Given a sequence of measurements u(k),v(k); k = 1,2, ....N, using image-object dif- 
ferential equations (3), we estimate both the relative coordinates p(k) and the earth co- 
ordinates r of the corresponding object 0 on the ground. The range estimation con- 
sists of two major parts: (a) computation of optical flow by extracting of measurements 
(u(k),v(k)) from the image and (b) estimation of range given the sequence of measurements 
u(fc), v(fc); k — 1,2, ....N. The next section will consider the computation of the optical flow. 
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3 Computation of Optical Flow 

The computation of optical flow requires the determination of the displacement of image 
points over a sequence of images. The main difficulty in the computation is due to the as- 
sumption that an object in the terrain space corresponds to a unique point in the image. In 
an actual image, an object on the ground is more likely to be a region in the image. Another 
complication in the computation of the optical flow, referred to as the correspondence prob- 
lem [15], results from the ambiguity in identifying features in two images that are projections 
of the same entity in the 3-dimensional world. 

There are two approaches to the computation of optical flow: (a) field-based techniques 
and (b) feature-based techniques. The field-based techniques assume a continuous variation 
of image intensity as a function of position and time. This approach was introduced by 
Horn [16, 17] and provides a dense map of the optical flow. However, the computational 
experience with this approach is limited to simple scenes and available algorithms based on 
this method are very susceptible to noise [18]. Recently, several new field-based methods 
have been reported in the literature [19, 20, 21, 22]. However, in this paper, we will restrict 
our attention to feature-based computation of optical flow. 

Feature-based techniques [23] make use of features in an image to measure optical flow. 
Features in an image can be points, lines, contours, regions, or any other geometrical defini- 
tion that corresponds to a distinguishable part of an object. The complexity of the algorithm 
depends on the definition of feature and the criteria used for matching. In general, robust- 
ness and computational load increase with the amount of modeling involved in the definition 
of the feature. The matching technique used in this paper is a modified version of area- 
based matching and is designed to overcome some of the common limitations associated 
with area-based matching [24]. The selection of areas for matching, based on regions of high 
pixel variance, eliminates featureless regions in the image. The use of normalized correla- 
tion makes matching independent of brightness changes from one image to the next. The 
knowledge of (V„u t ) and the estimated value of range is used to limit the search area used 
for matching. The choice of window size affects the performance of matching algorithms. A 
large window can provide more uniqueness in matching. However, a large window may also 
cover more than one object and lead to erroneous results near occlusion boundaries. Some 
authors have addressed this problem by using several window sizes in a hierarchy [25]. Our 
approach is to use a fixed size window and to discard inconsistent features (i.e. features 
which do not behave as predicted) during recursive range estimation. 

The algorithm used for computing optical flow between two successive image frames Ik 
and Ik+i consists of the following steps: 

1. The kth image frame J fc (u, v) is a discrete 2D function where the value of the function 
at each point (u, v) represents a shade of grey. In our implementation, the grey scale 
is represented by 8 bits and the image plane is 512 x 512 pixels. 

2. Each image Ik is divided into regions of n p x n v pixels. A feature, 5 P , is defined to be 
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Figure 2: Evolution of features in an image sequence 



Figure 3: Addition of new features 

a region which has a high variance. For our tests, we used regions of 9 x 9 and 11x11 
pixels. The variance threshold was selected as 5 percentage variation in intensity. 

3. The initial image is partitioned into n p x n v regions from which features are identified. 
Thereafter, at each subsequent frame, features from the previous frame are propagated 
to identify corresponding features in the current frame, possibly shifted in location. The 
evolution of features in a small segment of the image is shown in Figure 2. In addition, 
the current image is again partitioned into n p x n p regions to identify newly formed 
features. Figure 3 illustrates the procedure for adding new features. A region of high 
variance is picked as a new feature only if a major portion of it is not already covered 
by an old feature. This is done to keep the number of features to a minimum and 
is reasonable since not much additional information is gained by tracking completely 
overlapping features. A 50 percentage threshold is used in the tests. RA, RB, and 
RC are regions of high variance which qualify as features. However, only RB and RC 
are chosen as new features. At each frame the total number of features is equal to the 
features corresponding to the previous frame plus the new features. The total number 
of features varies from frame to frame and is bounded by the number of n p x n p regions 
in the image. 
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4. The feature in image I k is matched to its corresponding feature in the image I k + 1 using 
the normalized correlation coefficient. The normalized correlation , tj c , between two 
n p x n p regions A(i,j) and B(i,j) is defined by the relation: 


rip/iq - fig^b 

where 

lip Tl^, 

f^a = E£ A(i,j')] 

«=1 j=l 

np t \p 
»=1 }= 1 

»ab = EEA(t,i)B(i,i)] 
»=i j=i 

Tip np 

<r* = 

*=i j=i 

Tip Tip 

•=i j=i 


( 8 ) 

(9) 


( 10 ) 


A value of 1 signifies a perfect match between the regions. 

5. A feature in image I k +i corresponding to a given feature S p is determined by searching 
the image h+i for the region which has the highest normalized correlation with S p . 
Let u(Jfc + 1), v(k + 1) be the coordinates of the center of this region and let u(k), v(k) 
be the corresponding quantities for S p . [u(fc + 1) — u(k),v(k+ 1) — v(k)] is a measure 
of the optical flow or disparity associated with S p . 

6. The search area in image I k + 1 for finding correspondence to S p is limited to a narrow 
elliptic envelope, as shown in Figure 4, to reduce the amount of computation. The 
envelope depends on the sensor characteristics, rotorcraft motion, and the error covari- 
ance of the range estimates. As will be described in the next section, the predicted 
location of the feature S„, [u p (k + l),t )„(* + 1)], is provided by the recursive range 

estimation algorithm. [u p (Jfc + l),v p { k + 1)1 defines the center of the eUi P se ‘ The error 
covariance associated with z s {k + 1) is used to define the major axis of the ellipse. 
However, the search area is always large enough to accommodate a 30 percent error 
in z.(k + 1). The search along the major axis accounts for errors only in z,(k + 1). 
The search area is expanded to a narrow ellipse to account for errors in the location 
of u(k),v(k). The width of the search area is a compromise between the amount of 
computation for the search and the likelihood of missing a correct match. The minor 
axis of the ellipse is chosen to be 20 percent of the major axis for the results reported 
in this paper. 
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0.7 [u^+^.v^k+l)] 


[u£k+l).v^k+l)] 


Figure 4: Search area for feature matching 


7. The search step results in the location of [u(A:+ 1), v(k+ 1)] to pixel accuracy. Let (II, 
Jl) be the corresponding pixel location. We assume that tj c has a qudratic variation 
in the 8-pixel neighborhood of (II, Jl) and achieve sub-pixel accuracy by finding the 
maximum of rj c in this nighborhood. 

8. The result of the previous step is a sequence of disparity vectors or image track for 
each feature S p . 

We have used a search procedure to find the maximum of the normalized corelation co- 
efficient. The search procedure can be replaced by a number of techniques which optimize a 
function of two variables. The classical Newton method is simple and well known but subject 
to an array of difficulties in practice affecting reliability. Several variations of the Newton’s 
method are available to overcome these difficulties [26]. In general, modified Newton’s meth- 
ods provide a faster solution than the search procedure. However, this improvement may 
not be significant for the maximization of the normalized coefficient as a function of two 
variables. 

The next section will describe how the image track can be used recursively to estimate 
range corresponding to every feature in the image. 


4 Recursive Range Estimation 

Recently, recursive estimation and Kalman filters have been used by several authors to ad- 
dress problems in computer vision. Broida and Chellappa [27] have applied an EKF to 
estimate translational and rotational motion of a 2-dimensional rigid body. Wu [5] describes 
an EKF to estimate the positions and orientations of a rigid body by using a sequence of 
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images. Matthies [8] described a Kalman filter-based algorithm to estimate range using a 
sequence of images. However, this algorithm is limited to linear motions of the camera. The 
EKF has also been used to estimate the calibration parameters of the camera [28]. Wun- 
sche [29] describes a Kalman filter-based method for controlling a mobile robot by tracking 
three features. Three different Kalman filters for recursive range estimation were presented 
in [12] and their performance was evaluated using simulated image tracks. This approach 
was applied to a sequence of images for linear camera motion in [30]. Here, we present 
the algorithms and results for both linear and curvilinear motions of the sensor. The range 
estimation algorithm is described for one feature to keep the notation simple However, the 
same analysis holds for all fixed features on the ground. 

The object location estimation problem may be formulated as follows. Let a point object 
O have earth coordinates r = (i,y,z)^. The image point corresponding to this object point 
has coordinates (tt,v) T , where u and v are given by equation (4). The actual image point 
location will be different from the true value (u,u) T due to noise in the sensor and errors 
introduced by the optical flow. Let ( u m ,v m f be the measured coordinates of the image 
point, such that 

«m(*) = u(0 + n u (t) 

V m (t) = v(0 + n„(t) (11) 

where and n v represent “pixel” noise of the imaging system. n„ and n„ are assumed to be 
independent scalar white noise processes with standard deviations <7 U and <r„, respectively. 
In vector notation, measured or actual image point coordinates can be represented as 

Z(t) = h(t) + C,(t) (12) 

where 

h(t) = (u,v) T 

C*(<) = {nu,n v f 

and 

R = cov((,)s (13) 

The measured image point coordinates will move in the image plane as the rotorcraft flies 
a given trajectory. Given, estimates of rotorcraft position and velocity (translational and 
rotational) along it’s trajectory, image point measurements from successive image frames 
may be used to build a Kalman filter for recursively estimating the object point coordinates 
in earth axes (r) and in sensor axes (/>,). Because the measurements, Z, are nonlinear 
functions of the object point coordinates, r or p s , an extended Kalman filter must be used. 
The Kalman filters investigated in this paper have a linear continuous state model of the 
form , x 

X = F(t)X(t) + G(t)U{t) + C*(<) (14) 
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where X is the state vector U is the control input, £ x (f) is a continuous white noise with 
covariance Q c (representing modeling uncertainty), and F(t) and G(t) are time- varying ma- 
trices. Using a sampling interval of AT seconds, equation (14) can be replaced by the discrete 
form 

X(k + 1) = *(k)X{k) + T{k)U(k) + C x {k) (15) 

where k = t.AT, k + 1 = (* + 1)AT, i = 1,2,3..., 4>(fc) is the state transition matrix and 
T(Ar) is the input distribution matrix. The process noise Cx{k) is used to model uncertainties 
in the knowledge of V, and [w,]. Cx(k) is a discrete white noise sequence with covariance 
Q = Q c / AT. The measurements Z(t) are assumed to be available every A T m seconds where 
A T m = MAT , M being a constant positive integer. The measurements Z(t) are non-linearly 
related to the state through the vector function h(X(t)) and can be linearized to give the 
measurement equation. Thus, whenever i is an integral multiple of M , we have 

Z(k) = h[X(k)] -f Cz(k) (16) 

Given the state equations (15) and the image point measurements Z(k) of equation (16) the 
state estimate X(k) and it’s error covariance matrix P can be computed recursively using 
the Kalman filter [31]. 

The Kalman filter consists of two parts: 

1. Measurement Update : The measurement update is done whenever a new measurement 
is available, i.e., t is an integral multiple of M. Prior to processing a new measurement 
Z(k), we have the estimated value of the state X and the covariances P(k),Q(k) and 
R(k). The new measurement improves our estimate of the state and it’s covariance. 
The updated values are 

X(k) = X(k) -f K(k)[Z(k) - h(X(k))] 

P{k) = [I - K(k)H(k)]P(k) (17) 

where the matrix of partial derivatives 

H(k) = 8h{X)/dX (18) 

and the Kalman filter gain K(k) is computed using the equation 

K(k) = P{k)H T (k)[H(k)P(k)H T (k) + R(k)]- 1 (19) 

When i is not a multiple of M, 

X{k) = X(k) (20) 

P(k) = P(k) (21) 

2. Time Update : This part of the filter accounts for the system dynamics and propa- 
gates the state and its covariance matrix until the next measurement is made. The 
propagated values are: 

X(k + 1) = $(ik)A'(Jb) -(- T(k)U(k) 

P{k + 1) = ${k)P{k)<i>(k) T + T(k)Q(k)T(k) T (22) 
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The actual implementation of the Kalman Filter is done by updating the measurements 
one at a time. This results in a Kalman filter gain different from K(k). However, the final 
X and P at the end of the measurement update is the same as before. Due to the excellent 
numerical properties of the factorized form [32], the error covariance propagation is done by 
decomposing the state covariance P in the form 

P(k) = U{k)D{k)U T {k) (23) 


where U is an upper triangular matrix and D is a diagonal matrix. 

Three different Kalman filter formulations for object point location are discussed in [12]. 
They are the result of three different representations for the state vector in equations (14) 
and (15). Here, we choose the relative coordinates of the object point 0 with respect to the 
rotorcraft in sensor axes as the state vector. Thus, 



N 

* 

ill 

Cl 

III 

* 

(24) 

From equation (3), 

X = -[u.]X - V. 

(25) 


Equation (25) defines a time-varying linear system as in equation (14) with F(t) = -[£>•], 
G{t) = /, U{t) = -V. and ( x = 0. Note that u a and V, are the estimated rotorcraft angular 
and translational velocity vectors in sensor coordinates provided by the on-board inertial 
navigation system. 

The conversion of the continuous time- varying state model in equation (14) to the dis- 
crete form (15) is done assuming F(t), G(t ) and U(t) to be constant over a small interval 
of time AT. This assumption implies that the rotorcraft linear and angular velocity is con- 
stant during the time interval AT. This assumption can be satisfied by updating the state 
equations at a higher rate as is the case in most airborne inertial navigation systems. The 
conversion from the continuous to discrete form is usually done using numerical techniques 
[33]. Since F is a 3 x 3 matrix, by working in the frequency domain we can derive the discrete 
system equations analytically. This expression gives a more accurate value for $(t) than the 
numerical approximation resulting from a Taylor series expansion. 


Under these conditions, Appendix shows that 



*ll(fc) 

$«(*) 

$i 3 (fc) 


$«(*) 

M*) 

$3l(&) 

$32 (k) 

$ 32 (fc) 

$33 (k) . 


' r„(*) 

Tu(k) 

r, 3 (*) ■ 

r(fc) = 

r 2 i (*0 

T 2 2(k) 

r 3 i(fc) 

. r 3 ,(*) 

r 32 (fc) 

r 33(fc) 


(26) 


( 27 ) 
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Figure 5: Interaction between computation of optical flow and range estimation 


where and r,j are defined by equations (41)-(48). Using (16), we can write 



Z(k) = h[X(k)) + Uk) 

(28) 

where 

MA(fc)] = [/ii,/i 2 ] r = [fxJz a ,fy a /z,) T 

(29) 

The matrix 

rrry/.xi _ f dhi/dx, d^/dy, dh x /dz a 
dhi/dx, dhi/dyg 

(30) 

By taking the partial derivatives, the expression for H reduces to 



H[X(k)\ = H(p.) = l\ 1/z ' 0 

L u 1 /z a y t /z t \ P ' =h(k _ x) 

(31) 

Equations (25) and (31) can be used in the extended Kalman filter to recursively estimate 
the state X — p, = [£„ y a , z a ] T and its covariance P,. In earth coordinates, p and f can be 
computed using the equation 

p = t t tJp„ 


r = p + r h 


Similarly, P = 

T T P b T where P b = Tj P.T a . 



The interaction between optical flow calculations and recursive range estimation is shown 
in Figure 5. Assume that we want to process the image 7* to improve the range estimate 
corresponding to a feature S p (k — 1) from image /jt— i- The output of the Kalman filter based 
on measurements [u(fc — l),u(A: — 1)] provide the estimates X(k) = [x a (k),y a (k), z a (k)] T . 
Based on the state estimate, we can predict the location of S p (k) in image k to be 

u P (k) = fx a /z t 

v P ( k ) = fy>/z* (32) 
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The values [ u p (fc ), v p (k)] are provided to the optical flow algorithm to find the location of the 
corresponding feature at [u(fc), v(fc)]. Generally feature matching algorithms would require 
the derotation of the image Ik+i to account for the rotational motion of the vehicle before 
matching can be done between features in image /* and image h+i- However, by definition, 
the EKF accounts for total helicopter motion and due to its continuing update no derotation 
is necessary with the matching algorithm discussed here. 

The state error covariance matrix P provides a measure of the accuracy of the estimate. 
The magnitude of P{k) decreases monotonically as new measurements [u(k), v(fc)] are pro- 
vided to the Kalman filter by the optical flow calculations. The amount of decrease in P{k) 
over a finite number of image sequences, N, depends on the measurement noise R(k) and N. 
Let T c be the amount of time required for the estimates to converge to a satisfactory value 
and let A T m be the interval between measurements. We have T c = N.AT m . Increasing AT m 
to A7i leads to an increase in the amount of change in [u(fc),v(fc)] , and assuming the same 
error in the location of [u(k + 1 ),v{k + 1)], results in a higher signal to noise ratio. We may 
need N x < N images to reach the same level of convergence. The corresponding value of 
T c is Ti = TVj.ATi. Similarly, decreasing AT m to AT 2 leads to a decrease in the amount of 
change in [tt(fc), v(k)] , and assuming the same error in the location of [u(k + 1), v(k + 1)], 
results in a lower signal to noise ratio. We may need N 2 > N images to reach the same level 
of convergence. The corresponding value of T c is Tj = A^-AT-j. T c is a nonlinear function 
of AT m and has a minimum for a specific value of AT m . The variation of T c for simulated 
images shows a large decrease in T c with decreasing AT m upto a certain value, followed by a 
small decrease in T c with decreasing AT m [12] . For small values of AT m , the number of sam- 
ples required before the estimates converge is very large simply because of the poor geometry 
associated with each successive pairs of images. For large values of A T m the time required 
for convergence may be prohibitive due to A T m itself. The lower bound on A7* m depends 
on the amount of time required for low-level image processing. The choice of A T m depends 
on the change in [u(Jb),u(Jfc)j in successive images. As can be seen from equations (5) and 
(6), optical flow depends on the rotorcraft motion relative to the object. Larger values of 
AT m are needed for low rotorcraft speed in order to maintain the same signal to noise ratio 
in measurements. For the same vehicle maneuver, range estimates to objects far away from 
the FOE will converge faster than objects closer to the FOE. This suggests a strategy where 
the features are tracked at a high sampling rate as dictated by the minimum value of AT m ; 
but, designing the Kalman filter to employ a variable measurement update interval which 
depends on the magnitude of the displacement of the feature within the image. We are in 
the process of designing a event driven Kalman filter where a new measurement is accepted 
for update only if it passes a signal to noise threshold. This would result in a uniform signal 
to noise ratio over a large portion of the FOV . Our evaluation will focus on three different 
values for AT m . 
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5 Performance Evaluation 


The evaluation of algorithm performance has not received a lot of attention in computer 
vision reasearch. However, it is essential that algorithms be tested extensively before the 
technology can be transferred to applications. One reason for the lack of detailed evaluation 
is the absence of standard data sets of images. Recently, efforts are being made to collect a 
standard data set of images for testing algorithms. Another difficulty in the evaluation of 
algorithms is the need for an evaluation criterion. Visualization of the enormous amounts 
of information poses another challenge in the debugging and display of results. These is- 
sues have been recognized by researchers in computer vision and are the subject of several 
discussions [34, 35, 36]. We will describe the steps followed in the evaluation of the range 
estimation algorithm and relate our experience to problems and issues in the evaluation of 
computer vision algorithms. 

The evaluation of rotorcraft obstacle detection algorithms needs to be done using both 
laboratory image sequences and imagery data acquired during rotorcraft flight. NASA Ames 
has developed a set of flight data [37]. In the laboratory an experiment has been designed 
to acquire a sequence of images by mounting a camera on a 3 degree-of-freedom motion 
table. The camera position and orientation is controlled by a computer to achieve desired 
curvilinear camera motion. The details about the laboratory setup can be found in [38]. 
The camera can be moved at different speeds. Figure 6 shows the laboratory setup and the 
location of various objects. A sequence of 80 images was collected by moving the camera 
along a straight line during the segment AB followed by a cosine curve during the segment 
BC. Figure 7 shows the details of the camera path. The camera is stationary when the image 
is captured and is moved to the next position before taking the next image. The camera 
moves a distance of 0.125in along the direction AB between steps. The optical axis of the 
camera is tangential to the path ABC. The camera motion parameters were computed from 
the path ABC and provided as inputs to equation (25). This experiment can be scaled to 
simulate a helicopter flying at 20 knots where the objects in the FOV vary from 50m to 
500m and the images are being processed for range computation every 0.25 seconds. 

The position of the camera when the 36th, 60th, and 76th image is captured is referred 
to ais camera position 1, 2, and 3 respectively in subsequent discussion. These positions are 
indicated in Figure 7. Figures 8, 9, 10 and 11 show the zeroth, the 36th, the 60th and the 
76th image in the sequence of 80 images. Five different regions are highlighted in the im- 
ages. In Figure 9, these correspond to the edge of a bracket at 15.2 inches from the camera, 
a region which includes the FOE on a soda can about 22.7 inches from the camera, a pencil 
to the right of FOE and located at about 19.5 inches, another pencil to the left of FOE and 
located at about 24.5 inches, and a tape located at 56 inches from the camera. These regions 
were selected to examine the behavior of the algorithms at different ranges and at different 
angles with respect to the direction of motion of the sensor. 

It is difficult to state a single measure for the evaluation of range estimation algorithms. 
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Figure 6: Laboratory setup and location of objects 
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Figure 8: Zeroth image 



Figure 10: 60th image 


Figure 9: 36th image 


Figure 11: 76th image 
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If the true terrain data is available, an obvious measure is the accuracy of the estimate at 
each pixel location. The usefulness of the range information depends on the accuracy at 
different locations in the field-of-view (FOV) as well as regions in the image where range 
measurements are available. The desired accuracy and the critical regions in the FOV will 
vary with the obstacle avoidance algorithm and its implementation. For regions in the FOV 
where no true terrain data is available, the covariance of the range estimate can be used as a 
measure of the quality of the estimate. The accuracy of Kalman filter-based range estimates 
depends on the error in individual measurements at each frame and the number of frames 
used to estimate the range. The convergence of a Kalman filter depends on several different 
factors and we will explore the relationship between accuracy, time interval between images 
and the number of images. 

We adopted the following procedure in the evaluation of the algorithms: 

1. Compute range to all tracked features in the image and store their range and covariance 
as a function of the identification number of the feature and image sequence. 

2. The range estimates at various points in the FOV are color-coded and superimposed 
on the original image. Since different features corresponding to the same object should 
have approximately the same range, the color-coding gives a global picture of the range 
algorithm performance. 

3. Manually select regions in the image which correspond to the same object and for 
which truth data is available 

4. Compute the mean and standard deviation of the estimated range corresponding to all 
the features in a given region 

5. Examine the performance of the algorithm as a function of sampling time and the 
number of samples used to estimate the range. 

Next, we present range estimation results using the laboratory data following the pro- 
cedure outlined above. Features were detected in these images using algorithms described in 
section 3. The number of features in the image sequence varied from 600 to 1000. A physical 
object such as a pencil in the FOV gives rise to several features and their associated image 
tracks. Figure 12 shows the image tracks generated by four features corresponding to differ- 
ent parts of the right pencil. The feature tracks are of varying length since new features are 
created and some old features may either move out of the FOV or cannot be tracked when 
a new image is acquired for processing. Table 1 shows the identification number associated 
with the some of the features corresponding to the right pencil, the respective image plane 
coordinates and the axial component of range estimate (z t ) at the 80th image sequence. The 
identification number is used as a link to associate all the information and display relating 
to a single feature. Figure 13 shows the variations in the axial component (z t ) of the edge 
corresponding to the right pencil as a function of the number of iterations. The object 
position converges to the true value as the number of iterations increase. All the features 
corresponding to the same object will exhibit similar convergence and the plot can be used 
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Figure 12: Image tracks generated by 4 features corresponding to the right pencil 

to identify feature tracks which do not correspond to a real object on the ground. Figure 14 
shows a histogram of the estimated range using features corresponding to the right pencil. 
This histogram was computed at the end of the 76th image and has 98 features. Recall 
that the camera is moving along a straight line when the first 40 images were acquired and 
the camera has both translational and rotational velocity during the time when the next 
40 images are acquired. The time interval between images is fixed. This image sequence 
was processed in three different ways. Data set F uses all the 80 images. Data set H uses 
alternate images from data set F and has 40 images. Data set Q is a subset of data set H 
and is made up of 20 alternate images. Figure 15 illustrates the relationship between the 
three data sets. The sensor motion along the line AB between two images in data sets F, H, 
and Q is 0.125in, 0.25in, and 0.5in respectively. The three data sets provide us with three 
different measurement update intervals, slow, moderate and fast, in the estimation of range. 


We examine the range estimates using datasets F, H and Q respectively in the region 
corresponding to the Bracket. Figure 16. a, 16. b and 16.c show range estimates as a function 
of frame number using datasets F, H and Q respectively. It should be recalled that the 
number of range updates is equal to frame number/M where M takes the values 1 , 2 and 
4 for the datasets F, H and Q respectively. The dotted lines in the figure show the 5% 
estimation error boundaries. The estimation error converges to within 5% of the true value 
faster in dataset F. This is due to the fact that the disparity between images is significant 
in the dataset F in the region corresponding to the Bracket. At frame number 20, the range 
has been updated 20, 10 and 5 times in Figure 16.a, 16.b and 16.c respectively. Figure 17. a, 
17. b and 17.c shows similar results for the Right Pencil. The estimation error performance 
of the Right Pencil is similar to the Bracket. Figures 18.a, 18.b and 18. c show the range 
estimation results in the region corresponding to the can. The can represents a region in 
the image very close to the FOE. The range estimation results are not very reliable in this 
region. However, as can be seen from Figure 18.c the range estimation errors are smaller 
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Figure 13: Variation in range with number of iterations (right pencil) 


Table 1: Identification number and range 


id | u (pixels) | v (pixels) | range (in.) | 


143 

465.77 

274.98 


1357 

462.59 

280.16 

i|ig? fulfil 

1643 

448.16 

278.32 


1931 

445.84 

279.88 

12.50 

2064 

449.91 

263.81 

13.36 

2110 

444.65 

267.51 

13.72 

2162 

445.89 

264.24 

12.85 

2165 

459.54 

279.21 

13.09 

2280 

448.78 

276.31 

13.85 

2872 

456.10 

262.25 

17.26 

3371 

457.67 

271.29 

14.82 

3419 

456.46 

271.13 

15.78 

3535 

444.30 

271.10 

15.09 

3637 

456.21 

270.35 

14.69 

3788 

444.42 

268.47 

15.51 


20 









Figure 14: Histogram of range estimates (Right pencil, data set F, 76th image) 


Data sat F 
Data sat H 
Data sat Q 



Figure 15: Relationship between data sets F, H, and Q 
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for dataset Q than the other two sets. Figure 19 shows the range estimation results in the 
region corresponding to the Left Pencil. The range estimation results agree well with the 
true values. The estimation errors are smallest for the moderate update interval. 
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Framt Number 


Figure 16: (a) Range Estimation in the Bracket Region (Dataset F) 

Figures 20.a, 20.b and 20.c show the range estimation results in the region correspond- 
ing to the Tape. The region surrounding the tape was selected to examine the behavior of 
the algorithm at distances far away from the sensor. The changes in disparity of a feature 
corresponding to the tape using data set F, data set H and data set Q are shown in Figure 
21. The estimation errors are smallest for moderate update intervals. The reason for the 
poor performance of the algorithm using the small A T m is the disparity between images is 
very small. This results in a very low signal to noise ratio in the measurements resulting 
from the optical flow algorithm using data set F. At large update intervals, the number of 
updates for a given time is small. 

The above results indicate that it is possible to estimate range to an accuracy of 5 
to 10 percentage. In problems involving large variations of optical flow within the same 
image, the selection of the measurement update interval affects the estimation accuracy. 
This problem can be addressed by a variable update Kalman filter where the update rate 
is chosen depending on the disparity at a feature location. The main source of error in 
z is due to errors in camera calibration and knowledge of the camera co-ordinate system 
geometry with respect to the world-axis located on the motion table. The errors can be 
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Figure 16: (b) Range Estimation in the Bracket Region (Dataset H) 



Figure 16: (c) Range Estimation in the Bracket Region (Dataset Q) 
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Figure 17: (a) Range Estimation in the Right Pencil Region (Dataset F) 
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Figure 17: (b) Range Estimation in the Right Pencil Region (Dataset H) 


24 




*0* 




0 *0 40 00 

Frame Number 


•o 


Figure 17: (c) Range Estimation in the Right Pencil Region (Dataset Q) 



Figure 18: (a) Range Estimation in the Can Region (Dataset F) 
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Figure 18: (c) Range Estimation in the Can Region (Dataset Q) 
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Figure 19: Range Estimation in the Left Pencil Region (Dataset H) 
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Figure 20: (a) Range Estimation in the Tape Region (Dataset F) 
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Figure 20: (b) Range Estimation in the Tape Region (Dataset H) 
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Figure 20: (c) Range Estimation in the Tape Region (Dataset Q) 
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Data from "tapa” 



Figure 21: Motion in the image plane between images using data sets F, H, and Q 

further reduced by camera calibration which will reduce the uncertainty in the location of 
the camera co-ordinate system with respect to the motion table. 


6 Summary and Future Work 

We have presented a recursive method to estimate range to objects using a sequence of im- 
ages. The components of our algorithm were based on the characteristics of low-altitude 
rotorcraft flight. They are (i) natural scenarios, (ii) curvilinaer motion, (iii) large variation 
in the dynamic range of the optical flow, and (iv) the availability of sensor motion param- 
eters. These characteristics made it impossible to directly use several algorithms reported 
in the computer vision literature. The method was evaluated extensively using real images 
in a laboratory setup and produces good range estimates. The performance of the method 
can be improved further by camera calibration to remove inaccuracies in the determination 
of the image plane. Recently, we have tested the algorithms on several scenarios containing 
images from a CH-47 helicopter flight. The algorithm performs very well and the results will 
be reported in a forthcoming paper. 

We are considering the use of a Kalman filter with variable time intervals between esti- 
mate updates to deal with the wide dynamic range of optical flow in an image. Like most 
ranging algorithms depending on motion, the performance of the algorithm is poor close to 
the FOE. This problem is being addressed by the use of stereo and by integrating stereo and 
motion in a Kalman filter formulation. The method needs further evaluation using several 
different image sequences to test its robustness. 

So far, we have not addressed the real-time implementation issues. We plan to parallelize 
our algorithms for implementation on a parallel machine such as Intel’s iWARP. Alterna- 
tively, we are exploring ways to reduce the number of features per object by supplementing 
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feature tracking with knowledge about contiguous objects/surfaces. 

Feature detection methods provide a sparse range map in the FOV. They have to be 
supplemented by intelligent algorithms such as context dependent scene analysis [39] to fill 
the gaps in the range maps. These range maps may have to be further interrogated by an 
active system before the range information can be used by the obstacle avoidance system. 

The research reported in this paper is part of an on-going effort at NASA Ames to de- 
velop technologies for the automation of rotorcraft low altitude flight. The object detection 
and range estimation algorithms discussed are quite general and have potential applications 
in robotics and autonomous navigation of vehicles. The laboratory and flight image se- 
quences together with the optical flow and recursive estimation software can be requested 
from NAS A/ Ames. In addition to these feature-based algorithms, there are parallel efforts 
to investigate field-based techniques for the same range estimation applications [19, 20]. 


7 Appendix 

This appendix derives analytical expressions for the discrete equivalent of the continuous 
system in equation (25). Assume the linear and angular velocity of the sensor (V;,u>,) to be 
constant over a small interval of time A7\ For a time-invariant matrix F, the state transition 
matrix is given by the equation 

$(t) = exp(Ff) = I + Ft + ^i^t 2 + + (33) 

$(*) is evaluated numerically [33]. Because F is a 3 x 3 matrix, we can derive an 
expression for $(t) by working in the frequency domain. We have, 

$ = £ _1 [si — F] -1 (34) 

where s is the Laplace transform variable and £ _1 is the inverse Laplace transform. In the 
present case, 

F = [u;.(<)]|t=i.AT (35) 

We have 


[»/ -f] = 


S UJ SZ UJ 9 y 

lj $2 S M sx 

M S y Max & 


Further, 

[si - F]~ l = Adj[sl - F]/det[sI - F] 
where / is the 3x3 identity matrix, 

dtt[sl — F] = s(s 2 + a 2 ) 


(36) 


(37) 

(38) 


30 



and 


(39) 


a 2 =u&+u4+<4 


The adjoint matrix is given by the equation 


■5 2 + u\ x —SU> az + LO ax U ay SU) ay + U az U 3x 

Adj[sl-F]= scu az + u> ax u: ay s 2 + u> 2 y -su ax + w,„w J2 

_ -SU) ay + W„W„ SU ax + U ay U az S 2 + 0J az 




Let $ij be the ijth element of the 3x3 state transition matrix. <h, 7 is the Laplace inverse 
of the ijth element of the matrix [s/ — F] -1 . Then, using equations (37), (38) and (40), we 
have ^ 

a r-l s .2 i /•„. a r r\ ia-\\ 


where 


1 — cos(aAT) 


Similarly, 


where 


12 

= 

aui ax ui ay + bu> az 

13 

= 

au} az ui ax — boj ay 

21 

— 

au! ax ui ay — bu> ax 

22 

= 

ooj 2 v + cos (a AT) 

23 

= 

au ay u az + bu ax 

31 

= 

o^sz^tx "I" 

32 

= 

au> ay u> az — bu>tx 

33 

= 

au> az + cos (a AT) 


sin(aAT) 


Assuming the input U(t) = K(t) to be constant during the interval, the input distribution 
matrix T in equation (15) can be computed using the equation 


r(/) = /* T $(t)G{t)dt 

Jo 


Since <?(*) = 7, 


r(/) = / mdt (46) 

JO 

Using the values for 4»(t) from equations (41) and (44) in equation (46), and defining 

_ AT sin(aAT) 
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we have 


(48) 


r n = cu] x + sin(aAT)/a 
r 12 = cw sx u t y + aw M 

r i3 = cw >z u) ax - au>sy 

r 21 = ao, x u>, y - au, z 
r 22 = Ojj] y + sin(a AT) /a 

r 2 3 — — CLOgyLOgz 

r 3 i = cu> tz u), x + am,y 

r 3 2 = OjJ t yUJ$ Z OWjx 

T33 = cu, x + sin(aAT)/a 
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